library(topGO)
library(knitr)   # to use kable()
library(limma)   # to use vennDiagram()
library(ggplot2)
TAG <- params$TAG
VAR <- params$VAR
ANNOTATION <- list(genes = '../2019-07-26/genes/annotation.txt',
                   isoforms = '../2019-07-26/transcripts/annotation.txt')
TAGDIR <- paste0('../2020-01-08/', TAG, '/')

Introduction

This is the enrichment analysis for isoforms regulated by hatching. Because I quantified the association of expression with hatching in two different ways, this document has two sections. Section Variance reports the enrichment analysis performed with isoforms ordered by the proportion of expression variance explained by hatching. Note that some isoforms with low variance may still be highly associated with hatching, even if the fold change between levels of this factor is low. Section Differential expression uses an ordering of isoforms based on the significance of the differential expression between levels of hatching, which does depend on fold change.

Reading the data

Functional annotation is in 2019-07-26. I will also upload two lists of isoforms, with either proportion of variance explained by hatching or p-value of differential expression test.

tagVariance <- read.table(paste0(TAGDIR, VAR, '_variance.txt'))
tagPValue   <- read.table(paste0(TAGDIR, VAR, '_pvalue.txt'))
annotation  <- read.table(ANNOTATION[[TAG]], col.names = c('tagname', 'goterms'))

To initialize the topGOdata object, I will need the gene list as a named numeric vector, where the names are the isoforms identifiers and the numeric values, either the portion of variance explained by hatching or the p-values of the differential expression test. The structure() function adds attributes to an object.

Variance <- structure(tagVariance[,1], names = row.names(tagVariance))
PValues  <- structure(tagPValue[,1],   names = row.names(tagPValue))
rm(tagVariance, tagPValue)

There are 33037 isoforms scored with a variance portion and a differential expression p-value. It should actually be the exact same isoforms. The annotation data frame has more than one GO term for every tag, separated by |. I need a named list.

head(annotation)
##          tagname
## 1 TCONS_00000002
## 2 TCONS_00000010
## 3 TCONS_00000011
## 4 TCONS_00000016
## 5 TCONS_00000017
## 6 TCONS_00000018
##                                                             goterms
## 1                                  GO:0008417|GO:0016020|GO:0006486
## 2                                  GO:0043130|GO:0005515|GO:0043161
## 3                       GO:0005840|GO:0015935|GO:0003735|GO:0006412
## 4 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 5 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 6 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
allgenes2GO <- strsplit(as.character(annotation$goterms), "|", fixed = TRUE)
names(allgenes2GO) <- annotation$tagname
rm(annotation)

There are 19735 isoforms with GO annotations. But the differential expression analysis includes many more isoforms. In order to include the not-annotated isoforms in the enrichment analysis, to see if annotated or not annotated isoforms are more or less often differentially expressed, I should attribute a GO term to them. According to [http://geneontology.org/docs/faq/] nowadays we express lack of annotation by annotating to the root nodes, i.e. GO:0008150 biological_process, GO:0003674 molecular_function, and GO:0005575 cellular_component.

for (tag in unique(c(names(PValues), names(Variance)))) {
   if (! tag %in% names(allgenes2GO)) {
      allgenes2GO <- append(allgenes2GO,
         structure(list(c("GO:0008150", "GO:0003674", "GO:0005575")), names = tag))
   }
}

Using differential expression p-values

Building the topGO object

Creation of a topGO dataset is documented in section 4 of topGO’s the user manual: https://bioconductor.org/packages/release/bioc/vignettes/topGO/inst/doc/topGO.pdf. I need to use the new command, and fill up the slots. The annot object must be a function that compiles “a list of GO terms such that each element in the list is a character vector containing all the gene identifiers that are mapped to the respective GO term.” There are several options, that you can check using help(annFUN.gene2GO), for example. The annFUN.gene2GO requires a gene2GO argument, which is the list of gene-to-GO terms I made before. The geneSelectionFun object is a function that selects the interesting (most significant) genes. It is required to perform Fisher’s exact test. The nodeSize is used to prune the GO hierarchy from the terms which have less than n annotated genes.

I generate three datasets, to analyse each of the three ontologies.

selection <- function(allScores) {return(allScores < 0.05)}
GOdata.BP <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'BP', 
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdata.MF <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'MF',
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdata.CC <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'CC',
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
   Num_Genes = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), numGenes),
   Num_GO_terms = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
Number of feasible genes or transcripts and number of GO terms used in each data set.
ontology Num_Genes Num_GO_terms
BP 26235 1582
MF 30580 791
CC 21616 390

Running the tests

There are more than one way to test for enrichment. Something that took me a while to understand is that not only there are different test statistics (Fisher’s exact test, Kolmogorov-Smirnov, and others) but also different algorithms: classic, elim, weight… The algorithms are ways to deal with the dependence structure among GO terms due to topology. Some algorithms are compatible with all statistics implemented in topGO. But the weight and the parentchild algorithms are only applicable to Fisher’s exact test. I am not interested in the classic algorithm, which treats GO nodes as independent, and therefore produces an excess of significant results. I will not use the Fisher’s exact test, because it dependes on an arbitrary threshold of significance on non-adjusted p-values.

BP.elim     <- runTest(GOdata.BP, algorithm = "elim",     statistic = "ks")
BP.weight01 <- runTest(GOdata.BP, algorithm = "weight01", statistic = "ks")
BP.lea      <- runTest(GOdata.BP, algorithm = "lea",      statistic = "ks")
MF.elim     <- runTest(GOdata.MF, algorithm = "elim",     statistic = "ks")
MF.weight01 <- runTest(GOdata.MF, algorithm = "weight01", statistic = "ks")
MF.lea      <- runTest(GOdata.MF, algorithm = "lea",      statistic = "ks")
CC.elim     <- runTest(GOdata.CC, algorithm = "elim",     statistic = "ks")
CC.weight01 <- runTest(GOdata.CC, algorithm = "weight01", statistic = "ks")
CC.lea      <- runTest(GOdata.CC, algorithm = "lea",      statistic = "ks")

ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
   algorithm = rep(c("elim", "weight01", "lea"), 3),
   TermsTested = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) length(score(X))),
   Significant = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) sum(score(X) < 0.01)))

kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.
ontology algorithm TermsTested Significant
BP elim 1582 22
BP weight01 1582 10
BP lea 1582 34
MF elim 791 10
MF weight01 791 13
MF lea 791 13
CC elim 390 2
CC weight01 390 1
CC lea 390 3
rm(ResultsSummary)

Results

The topGO package does not pay much attention to what terms are significant because they are overrepresented and which ones are underrepresented among the most differentially expressed genes. I think it’s worth separating them, to facilitate the biological interpretation. Note that not all terms listed in the tables below are significant. The scores for the three methods (elim, weight01 and lea) are non-corrected p-values.

Biological process

orderedTerms <- names(sort(score(BP.weight01)))
significant.weight01 <- score(BP.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(BP.lea)[orderedTerms] <= 0.01
significant.elim     <- score(BP.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.pvalue.sigTerms <- sigTerms

BP.all <- GenTable(GOdata.BP, elim=BP.elim, weight01=BP.weight01, lea=BP.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(
   BP.all[BP.all$Significant > BP.all$Expected,],
   caption = "Most over-represented terms of the Biological Process ontology.")
Most over-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0001522 pseudouridine synthesis 31 4 1.56 2 0.0014 0.0014 0.00139
2 GO:0046854 phosphatidylinositol phosphorylation 45 4 2.26 4 0.0027 0.0027 0.00274
3 GO:0006400 tRNA modification 58 7 2.91 48 0.0157 0.0035 0.01574
4 GO:0050684 regulation of mRNA processing 15 3 0.75 13 0.0060 0.0041 0.00599
5 GO:0030513 positive regulation of BMP signaling pat… 9 4 0.45 9 0.0053 0.0053 0.00527
6 GO:0016579 protein deubiquitination 90 9 4.52 11 0.0058 0.0058 0.00577
7 GO:0060271 cilium assembly 73 4 3.67 8 0.0048 0.0060 0.00037
8 GO:0070286 axonemal dynein complex assembly 9 1 0.45 18 0.0078 0.0078 0.00778
9 GO:0048015 phosphatidylinositol-mediated signaling 40 4 2.01 22 0.0087 0.0087 0.00873
10 GO:0055072 iron ion homeostasis 21 3 1.06 575 0.3722 0.0095 0.37225
11 GO:0007411 axon guidance 11 3 0.55 30 0.0119 0.0119 0.01186
13 GO:0015937 coenzyme A biosynthetic process 8 1 0.40 41 0.0130 0.0130 0.01296
14 GO:0046434 organophosphate catabolic process 52 3 2.61 924 0.6194 0.0155 0.61935
15 GO:0060828 regulation of canonical Wnt signaling pa… 5 2 0.25 45 0.0157 0.0157 0.01569
18 GO:0048017 inositol lipid-mediated signaling 50 5 2.51 51 0.0178 0.0178 0.00266
19 GO:0006654 phosphatidic acid biosynthetic process 10 1 0.50 52 0.0178 0.0178 0.01785
20 GO:0007275 multicellular organism development 85 7 4.27 136 0.0616 0.0185 0.00270
21 GO:0006511 ubiquitin-dependent protein catabolic pr… 204 12 10.25 375 0.2410 0.0187 0.24104
22 GO:0007040 lysosome organization 8 2 0.40 55 0.0195 0.0195 0.01949
kable(
   BP.all[BP.all$Significant < BP.all$Expected,],
   caption = "Most under-represented terms of the Biological Process ontology.")
Most under-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
12 GO:0006418 tRNA aminoacylation for protein translat… 87 2 4.37 84 0.0368 0.0122 0.03681
16 GO:0050953 sensory perception of light stimulus 11 0 0.55 852 0.5606 0.0165 0.56064
17 GO:0006030 chitin metabolic process 103 5 5.17 128 0.0581 0.0175 0.05810
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term = Term(GOTERM[sigTerms]),
                 Definition = Definition(GOTERM[sigTerms]),
                 PValue=score(BP.weight01)[sigTerms]),
      caption = paste('Biological process terms significantly associated with',
                      VAR, 'according to all 3 algorithms', sep=' '))
Biological process terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0001522 pseudouridine synthesis The intramolecular conversion of uridine to pseudouridine within an RNA molecule. This posttranscriptional base modification occurs in tRNA, rRNA, and snRNAs. 0.0013926
GO:0046854 phosphatidylinositol phosphorylation The process of introducing one or more phosphate groups into a phosphatidylinositol, any glycerophosphoinositol having one phosphatidyl group esterified to one of the hydroxy groups of inositol. 0.0027365
GO:0050684 regulation of mRNA processing Any process that modulates the frequency, rate or extent of mRNA processing, those processes involved in the conversion of a primary mRNA transcript into a mature mRNA prior to its translation into polypeptide. 0.0040779
GO:0030513 positive regulation of BMP signaling pathway Any process that activates or increases the frequency, rate or extent of BMP signaling pathway activity. 0.0052673
GO:0016579 protein deubiquitination The removal of one or more ubiquitin groups from a protein. 0.0057656
GO:0060271 cilium assembly The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. 0.0059768
GO:0070286 axonemal dynein complex assembly The aggregation, arrangement and bonding together of a set of components to form an axonemal dynein complex, a dynein complex found in eukaryotic cilia and flagella, in which the motor domain heads interact with adjacent microtubules to generate a sliding force which is converted to a bending motion. 0.0077824
GO:0048015 phosphatidylinositol-mediated signaling A series of molecular signals in which a cell uses a phosphatidylinositol-mediated signaling to convert a signal into a response. Phosphatidylinositols include phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0087267

I think the GO graph is useful to see the relationship among the significant terms. But too large graphs are impossible to read. I don’t know how to split the graph in meaningful subgraphs.

showSigOfNodes(GOdata.BP, score(BP.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 266 
## Number of Edges = 512 
## 
## $complete.dag
## [1] "A graph with 266 nodes."

This is just a example of the most significant GO term:

showGroupDensity(GOdata.BP, names(score(BP.weight01))[which.min(score(BP.weight01))])

Molecular function

orderedTerms <- names(sort(score(MF.weight01)))
significant.weight01 <- score(MF.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(MF.lea)[orderedTerms] <= 0.01
significant.elim     <- score(MF.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.pvalue.sigTerms <- sigTerms

MF.all <- GenTable(GOdata.MF, elim=MF.elim, weight01=MF.weight01, lea=MF.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(
   MF.all[MF.all$Significant > MF.all$Expected,],
   caption = "Most over-represented terms of the Molecular Function ontology.")
Most over-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0016747 transferase activity, transferring acyl … 158 9 7.81 142 0.17607 0.00023 0.17607
3 GO:0005509 calcium ion binding 638 34 31.52 3 0.00086 0.00086 0.00086
4 GO:0035091 phosphatidylinositol binding 109 9 5.39 1 0.00043 0.00091 0.00043
7 GO:0009982 pseudouridine synthase activity 27 2 1.33 5 0.00277 0.00277 0.00277
8 GO:0060090 molecular adaptor activity 45 6 2.22 17 0.01348 0.00450 0.01348
9 GO:0005515 protein binding 4270 227 210.99 27 0.02374 0.00452 0.01001
10 GO:0016788 hydrolase activity, acting on ester bond… 539 30 26.63 52 0.06374 0.00478 0.03279
kable(
   MF.all[MF.all$Significant < MF.all$Expected,],
   caption = "Most under-represented terms of the Molecular Function ontology.")
Most under-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
2 GO:0140101 catalytic activity, acting on a tRNA 139 4 6.87 2 0.00065 0.00035 0.00065
5 GO:0004812 aminoacyl-tRNA ligase activity 90 2 4.45 22 0.01794 0.00098 0.01794
6 GO:0004594 pantothenate kinase activity 5 0 0.25 4 0.00254 0.00254 0.00254
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(MF.weight01)[sigTerms]),
      caption = paste('Molecular function terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Molecular function terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0140101 catalytic activity, acting on a tRNA NA 0.0003490
GO:0005509 calcium ion binding Interacting selectively and non-covalently with calcium ions (Ca2+). 0.0008562
GO:0035091 phosphatidylinositol binding Interacting selectively and non-covalently with any inositol-containing glycerophospholipid, i.e. phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0009095
GO:0004594 pantothenate kinase activity Catalysis of the reaction: ATP + pantothenate = ADP + D-4’-phosphopantothenate. 0.0025374
GO:0009982 pseudouridine synthase activity Catalysis of the reaction: RNA uridine = RNA pseudouridine. Conversion of uridine in an RNA molecule to pseudouridine by rotation of the C1’-N-1 glycosidic bond of uridine in RNA to a C1’-C5. 0.0027714
GO:0008417 fucosyltransferase activity Catalysis of the transfer of a fucosyl group to an acceptor molecule, typically another carbohydrate or a lipid. 0.0048171
GO:0004630 phospholipase D activity Catalysis of the reaction: a phosphatidylcholine + H2O = choline + a phosphatidate. 0.0055804
showSigOfNodes(GOdata.MF, score(MF.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 38 
## Number of Edges = 41 
## 
## $complete.dag
## [1] "A graph with 38 nodes."
showGroupDensity(GOdata.MF, names(score(MF.weight01))[which.min(score(MF.weight01))])

Cellular component

orderedTerms <- names(sort(score(CC.weight01)))
significant.weight01 <- score(CC.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(CC.lea)[orderedTerms] <= 0.01
significant.elim     <- score(CC.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.pvalue.sigTerms <- sigTerms

CC.all <- GenTable(GOdata.CC, elim=CC.elim, weight01=CC.weight01, lea=CC.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
   CC.all[CC.all$Significant > CC.all$Expected,],
   caption = "Most over-represented terms of the Cellular Component ontology.")
Most over-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:1902555 endoribonuclease complex 10 1 0.48 10 0.020 0.0041 0.020
kable(
   CC.all[CC.all$Significant < CC.all$Expected,],
   caption = "Most under-represented terms of the Cellular Component ontology.")
Most under-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
2 GO:0045211 postsynaptic membrane 28 1 1.35 6 0.015 0.0150 0.015
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(CC.weight01)[sigTerms]),
      caption = paste('Cellular component terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Cellular component terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
showSigOfNodes(GOdata.CC, score(CC.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 18 
## Number of Edges = 28 
## 
## $complete.dag
## [1] "A graph with 18 nodes."
showGroupDensity(GOdata.CC, names(score(CC.weight01))[which.min(score(CC.weight01))])

Using the portion of variance explained by hatching

Building the topGO object

I need to generate the topGO objects again, using the alternative gene ordering, based on the proportion of expression-level variance explained by hatching. I miss a way to inform the topGOdata object that the score in this case is reversed, with respect to p-values: the higher it is, the more differentially expressed the gene is. To make sure that GO terms are tested in the same way than when using p-values, I will just reverse the proportion of variance explained by hatching to its complement. Taking this into account, the subset of interesting genes (selection() function) must be defined as the lowest 10% scores, which are the 10% genes with largest portion of variance explained by hatching.

selection <- function(allScores) {return(allScores <= quantile(allScores, probs = 0.10))}
GOdataVar.BP <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'BP', 
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdataVar.MF <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'MF',
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdataVar.CC <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'CC',
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

library(knitr)
DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
   Num_Genes = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), numGenes),
   Num_GO_terms = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
Number of feasible genes or transcripts and number of GO terms used in each data set.
ontology Num_Genes Num_GO_terms
BP 26235 1582
MF 30580 791
CC 21616 390

Running the tests

BPvar.elim     <- runTest(GOdataVar.BP, algorithm = "elim",     statistic = "ks")
BPvar.weight01 <- runTest(GOdataVar.BP, algorithm = "weight01", statistic = "ks")
BPvar.lea      <- runTest(GOdataVar.BP, algorithm = "lea",      statistic = "ks")
MFvar.elim     <- runTest(GOdataVar.MF, algorithm = "elim",     statistic = "ks")
MFvar.weight01 <- runTest(GOdataVar.MF, algorithm = "weight01", statistic = "ks")
MFvar.lea      <- runTest(GOdataVar.MF, algorithm = "lea",      statistic = "ks")
CCvar.elim     <- runTest(GOdataVar.CC, algorithm = "elim",     statistic = "ks")
CCvar.weight01 <- runTest(GOdataVar.CC, algorithm = "weight01", statistic = "ks")
CCvar.lea      <- runTest(GOdataVar.CC, algorithm = "lea",      statistic = "ks")

ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
   algorithm = rep(c("elim", "weight01", "lea"), 3),
   TermsTested = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) length(score(X))),
   Significant = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) sum(score(X) < 0.01)))

kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.
ontology algorithm TermsTested Significant
BP elim 1582 27
BP weight01 1582 16
BP lea 1582 47
MF elim 791 19
MF weight01 791 15
MF lea 791 22
CC elim 390 11
CC weight01 390 8
CC lea 390 17
rm(ResultsSummary)

Results

Biological process

orderedTerms <- names(sort(score(BPvar.weight01)))
significant.weight01 <- score(BPvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(BPvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(BPvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.variance.sigTerms <- sigTerms

BPvar.all <- GenTable(GOdataVar.BP, elim=BPvar.elim, weight01=BPvar.weight01, lea=BPvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(
   BPvar.all[BPvar.all$Significant > BPvar.all$Expected,],
   caption = "Most over-represented terms of the Biological Process ontology.")
Most over-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
1 GO:0034474 U2 snRNA 3’-end processing 17 6 1.70 1 5e-05 5e-05 5e-05
2 GO:0044782 cilium organization 99 10 9.89 215 0.0742 0.00025 0.0742
3 GO:0001522 pseudouridine synthesis 31 9 3.10 2 0.0010 0.00102 0.0010
4 GO:0046854 phosphatidylinositol phosphorylation 45 14 4.50 3 0.0011 0.00107 0.0011
5 GO:0048015 phosphatidylinositol-mediated signaling 40 13 4.00 5 0.0013 0.00132 0.0013
6 GO:0043547 positive regulation of GTPase activity 27 8 2.70 6 0.0016 0.00159 0.0016
7 GO:0032324 molybdopterin cofactor biosynthetic proc… 5 3 0.50 7 0.0016 0.00162 0.0016
8 GO:0006400 tRNA modification 58 10 5.80 289 0.1084 0.00246 0.2870
9 GO:0008608 attachment of spindle microtubules to ki… 6 3 0.60 13 0.0043 0.00426 0.0043
10 GO:0043063 intercellular bridge organization 6 3 0.60 14 0.0043 0.00426 0.0043
11 GO:0006336 DNA replication-independent nucleosome a… 13 3 1.30 15 0.0045 0.00451 0.0045
12 GO:0007264 small GTPase mediated signal transductio… 228 32 22.79 26 0.0085 0.00588 0.0085
13 GO:0035307 positive regulation of protein dephospho… 5 2 0.50 18 0.0061 0.00614 0.0061
14 GO:0050953 sensory perception of light stimulus 11 2 1.10 971 0.5934 0.00712 0.5934
16 GO:0010256 endomembrane system organization 61 10 6.10 287 0.1076 0.00803 0.2482
17 GO:0071900 regulation of protein serine/threonine k… 20 5 2.00 689 0.3791 0.01059 0.3791
18 GO:0000183 chromatin silencing at rDNA 5 1 0.50 31 0.0121 0.01209 0.0121
19 GO:0006997 nucleus organization 5 2 0.50 35 0.0128 0.01285 0.0128
20 GO:0042147 retrograde transport, endosome to Golgi 26 5 2.60 41 0.0148 0.01477 0.0148
21 GO:0015937 coenzyme A biosynthetic process 8 2 0.80 42 0.0148 0.01478 0.0148
22 GO:0022008 neurogenesis 39 9 3.90 12 0.0036 0.01516 0.0036
23 GO:0032367 intracellular cholesterol transport 5 3 0.50 53 0.0176 0.01764 0.0176
24 GO:0098535 de novo centriole assembly involved in m… 5 1 0.50 56 0.0182 0.01817 0.0182
25 GO:0016579 protein deubiquitination 90 12 8.99 61 0.0187 0.01866 0.0187
26 GO:0006777 Mo-molybdopterin cofactor biosynthetic p… 8 2 0.80 64 0.0201 0.02010 0.0201
27 GO:0006310 DNA recombination 89 12 8.89 232 0.0839 0.02271 0.0839
kable(
   BPvar.all[BPvar.all$Significant < BPvar.all$Expected,],
   caption = "Most under-represented terms of the Biological Process ontology.")
Most under-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
15 GO:0016573 histone acetylation 31 3 3.1 21 0.0074 0.00736 0.0074
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(BPvar.weight01)[sigTerms]),
      caption = paste('Biological process terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Biological process terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0034474 U2 snRNA 3’-end processing Any process involved in forming the mature 3’ end of a U2 snRNA molecule. 0.0000497
GO:0001522 pseudouridine synthesis The intramolecular conversion of uridine to pseudouridine within an RNA molecule. This posttranscriptional base modification occurs in tRNA, rRNA, and snRNAs. 0.0010244
GO:0046854 phosphatidylinositol phosphorylation The process of introducing one or more phosphate groups into a phosphatidylinositol, any glycerophosphoinositol having one phosphatidyl group esterified to one of the hydroxy groups of inositol. 0.0010707
GO:0048015 phosphatidylinositol-mediated signaling A series of molecular signals in which a cell uses a phosphatidylinositol-mediated signaling to convert a signal into a response. Phosphatidylinositols include phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0013211
GO:0043547 positive regulation of GTPase activity Any process that activates or increases the activity of a GTPase. 0.0015858
GO:0032324 molybdopterin cofactor biosynthetic process The chemical reactions and pathways resulting in the formation of the molybdopterin cofactor (Moco), essential for the catalytic activity of some enzymes, e.g. sulfite oxidase, xanthine dehydrogenase, and aldehyde oxidase. The cofactor consists of a mononuclear molybdenum (Mo-molybdopterin) or tungsten ion (W-molybdopterin) coordinated by one or two molybdopterin ligands. 0.0016215
GO:0008608 attachment of spindle microtubules to kinetochore The process in which spindle microtubules become physically associated with the proteins making up the kinetochore complex. 0.0042561
GO:0043063 intercellular bridge organization A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of the intracellular bridge. An intracellular bridge is a direct link between the cytoplasms of sister cells that allows cells to communicate with one another. 0.0042561
GO:0006336 DNA replication-independent nucleosome assembly The formation of nucleosomes outside the context of DNA replication. 0.0045060
GO:0007264 small GTPase mediated signal transduction Any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. 0.0058771
GO:0035307 positive regulation of protein dephosphorylation Any process that activates or increases the frequency, rate or extent of removal of phosphate groups from a protein. 0.0061426
GO:0016573 histone acetylation The modification of a histone by the addition of an acetyl group. 0.0073586
showSigOfNodes(GOdataVar.BP, score(BPvar.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 289 
## Number of Edges = 579 
## 
## $complete.dag
## [1] "A graph with 289 nodes."

Below I plot variance portion, but for the termo found most significant when using p-values, for comparison.

showGroupDensity(GOdataVar.BP, names(score(BP.weight01))[which.min(score(BP.weight01))])

Molecular function

orderedTerms <- names(sort(score(MFvar.weight01)))
significant.weight01 <- score(MFvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(MFvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(MFvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.variance.sigTerms <- sigTerms

MFvar.all <- GenTable(GOdataVar.MF, elim=MFvar.elim, weight01=MFvar.weight01, lea=MFvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(
   MFvar.all[MFvar.all$Significant > MFvar.all$Expected,],
   caption = "Most over-represented terms of the Molecular Function ontology.")
Most over-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0005515 protein binding 4270 463 421.41 2 0.00050 0.00011 4.2e-05
GO:0060090 molecular adaptor activity 45 16 4.44 1 0.00014 0.00014 1.2e-05
GO:0000166 nucleotide binding 2277 277 224.72 68 0.05260 0.00019 0.01496
GO:0016417 S-acyltransferase activity 6 5 0.59 3 0.00079 0.00079 0.00079
GO:0009982 pseudouridine synthase activity 27 7 2.66 4 0.00116 0.00116 0.00116
GO:0005524 ATP binding 1688 215 166.59 6 0.00260 0.00260 0.00260
GO:0003887 DNA-directed DNA polymerase activity 43 11 4.24 7 0.00310 0.00310 0.00310
GO:0004386 helicase activity 94 18 9.28 16 0.00846 0.00343 0.00846
GO:0016748 succinyltransferase activity 5 4 0.49 10 0.00430 0.00430 0.00430
GO:0008138 protein tyrosine/serine/threonine phosph… 45 5 4.44 11 0.00478 0.00478 0.00478
GO:0140101 catalytic activity, acting on a tRNA 139 14 13.72 318 0.39110 0.00603 0.94371
GO:0004721 phosphoprotein phosphatase activity 139 18 13.72 157 0.18331 0.00626 0.18331
GO:0070840 dynein complex binding 6 2 0.59 17 0.00967 0.00967 0.00967
GO:0008093 cytoskeletal adaptor activity 6 2 0.59 18 0.00967 0.00967 0.00967
GO:0046527 glucosyltransferase activity 17 5 1.68 57 0.04767 0.00992 0.04767
GO:0035091 phosphatidylinositol binding 109 22 10.76 15 0.00815 0.01428 0.00815
GO:0043015 gamma-tubulin binding 23 5 2.27 24 0.01438 0.01438 0.01438
GO:0016301 kinase activity 866 119 85.47 25 0.01678 0.01678 0.01678
GO:0008483 transaminase activity 32 5 3.16 21 0.01074 0.01721 0.01074
kable(
   MFvar.all[MFvar.all$Significant < MFvar.all$Expected,],
   caption = "Most under-represented terms of the Molecular Function ontology.")
Most under-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(MFvar.weight01)[sigTerms]),
      caption = paste('Molecular functions terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Molecular functions terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0005515 protein binding Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules). 0.0001129
GO:0060090 molecular adaptor activity The binding activity of a molecule that brings together two or more molecules through a selective, non-covalent, often stoichiometric interaction, permitting those molecules to function in a coordinated way. 0.0001387
GO:0016417 S-acyltransferase activity Catalysis of the transfer of an acyl group to a sulfur atom on the acceptor molecule. 0.0007907
GO:0009982 pseudouridine synthase activity Catalysis of the reaction: RNA uridine = RNA pseudouridine. Conversion of uridine in an RNA molecule to pseudouridine by rotation of the C1’-N-1 glycosidic bond of uridine in RNA to a C1’-C5. 0.0011590
GO:0005524 ATP binding Interacting selectively and non-covalently with ATP, adenosine 5’-triphosphate, a universally important coenzyme and enzyme regulator. 0.0026010
GO:0003887 DNA-directed DNA polymerase activity Catalysis of the reaction: deoxynucleoside triphosphate + DNA(n) = diphosphate + DNA(n+1); the synthesis of DNA from deoxyribonucleotide triphosphates in the presence of a DNA template and a 3’hydroxyl group. 0.0031021
GO:0004386 helicase activity Catalysis of the reaction: NTP + H2O = NDP + phosphate, to drive the unwinding of a DNA or RNA helix. 0.0034318
GO:0016748 succinyltransferase activity Catalysis of the transfer of a succinyl (3-carboxypropanoyl) group to an acceptor molecule. 0.0043016
GO:0008138 protein tyrosine/serine/threonine phosphatase activity Catalysis of the reactions: protein serine + H2O = protein serine + phosphate; protein threonine phosphate + H2O = protein threonine + phosphate; and protein tyrosine phosphate + H2O = protein tyrosine + phosphate. 0.0047760
GO:0070840 dynein complex binding Interacting selectively and non-covalently with a dynein complex, a protein complex that contains two or three dynein heavy chains and several light chains, and has microtubule motor activity. 0.0096650
GO:0008093 cytoskeletal adaptor activity The binding activity of a molecule that brings together a cytoskeletal protein and one or more other molecules, permitting them to function in a coordinated way. 0.0096650
showSigOfNodes(GOdataVar.MF, score(MFvar.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 64 
## Number of Edges = 76 
## 
## $complete.dag
## [1] "A graph with 64 nodes."

I plot variance portion, but for the term found most significant when using p-values, for comparison.

showGroupDensity(GOdataVar.MF, names(score(MF.weight01))[which.min(score(MF.weight01))])

Cellular component

orderedTerms <- names(sort(score(CCvar.weight01)))
significant.weight01 <- score(CCvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(CCvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(CCvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
if (length(sigTerms) == 0) {
   sigTerms <- orderedTerms[significant.lea & significant.elim]
}
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.variance.sigTerms <- sigTerms

CCvar.all <- GenTable(GOdataVar.CC, elim=CCvar.elim, weight01=CCvar.weight01, lea=CCvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=max(sum(significant.elim), 10))

kable(
   CCvar.all[CCvar.all$Significant > CCvar.all$Expected,],
   caption = "Most over-represented terms of the Cellular Component ontology.")
Most over-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0036064 ciliary basal body 14 5 1.33 3 0.00136 0.0014 0.00136
GO:0031981 nuclear lumen 280 38 26.68 2 0.00079 0.0028 0.00079
GO:1990204 oxidoreductase complex 28 10 2.67 1 0.00061 0.0029 0.00061
GO:0045239 tricarboxylic acid cycle enzyme complex 5 4 0.48 5 0.00415 0.0041 0.00415
GO:0035658 Mon1-Ccz1 complex 10 3 0.95 6 0.00487 0.0049 0.00487
GO:0005815 microtubule organizing center 96 16 9.15 22 0.01319 0.0081 0.13310
GO:0005813 centrosome 50 10 4.76 10 0.00810 0.0081 0.00810
GO:0005634 nucleus 1025 119 97.68 11 0.00820 0.0092 0.00021
GO:0005720 nuclear heterochromatin 5 1 0.48 17 0.01151 0.0115 0.01151
GO:0008622 epsilon DNA polymerase complex 11 5 1.05 21 0.01289 0.0129 0.01289
GO:1902555 endoribonuclease complex 10 1 0.95 27 0.01995 0.0161 0.01995
kable(
   CCvar.all[CCvar.all$Significant < CCvar.all$Expected,],
   caption = "Most under-represented terms of the Cellular Component ontology.")
Most under-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(CCvar.weight01)[sigTerms]),
      caption = paste('Cellular component terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Cellular component terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0036064 ciliary basal body A membrane-tethered, short cylindrical array of microtubules and associated proteins found at the base of a eukaryotic cilium (also called flagellum) that is similar in structure to a centriole and derives from it. The cilium basal body is the site of assembly and remodelling of the cilium and serves as a nucleation site for axoneme growth. As well as anchoring the cilium, it is thought to provide a selective gateway regulating the entry of ciliary proteins and vesicles by intraflagellar transport. 0.0013574
GO:0031981 nuclear lumen The volume enclosed by the nuclear inner membrane. 0.0027521
GO:1990204 oxidoreductase complex Any protein complex that possesses oxidoreductase activity. 0.0029066
GO:0045239 tricarboxylic acid cycle enzyme complex Any of the heteromeric enzymes that act in the TCA cycle. 0.0041466
GO:0035658 Mon1-Ccz1 complex A protein complex that functions as a guanine nucleotide exchange factor (GEF) and converts Rab-GDP to Rab-GTP. In S. cerevisiae, this complex consists of at least Mon1 and Ccz1, and serves as a GEF for the Rab Ypt7p. 0.0048672
GO:0005813 centrosome A structure comprised of a core structure (in most organisms, a pair of centrioles) and peripheral material from which a microtubule-based structure, such as a spindle apparatus, is organized. Centrosomes occur close to the nucleus during interphase in many eukaryotic cells, though in animal cells it changes continually during the cell-division cycle. 0.0081015
GO:0005634 nucleus A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell’s chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent. 0.0091673
showSigOfNodes(GOdataVar.CC, score(CCvar.weight01),
               firstSigNodes = sum(significant.weight01),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 46 
## Number of Edges = 77 
## 
## $complete.dag
## [1] "A graph with 46 nodes."

For comparison, I plot the distribution of variance portion for the CC term found most significant when using p-values.

showGroupDensity(GOdataVar.CC, names(score(CC.weight01))[which.min(score(CC.weight01))])

Comparison between the two ordering of genes.

Biological process

allTerms <- usedGO(GOdata.BP)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% BP.pvalue.sigTerms,
                             Variance = allTerms %in% BP.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(BP.weight01))[allTerms],
                         Variance = rank(score(BPvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth() + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of BP terms by significance')

Molecular function

allTerms <- usedGO(GOdata.MF)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% MF.pvalue.sigTerms,
                             Variance = allTerms %in% MF.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(MF.weight01))[allTerms],
                         Variance = rank(score(MFvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth() + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of MF terms by significance')

Cellular component

allTerms <- usedGO(GOdata.CC)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% CC.pvalue.sigTerms,
                             Variance = allTerms %in% CC.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(CC.weight01))[allTerms],
                         Variance = rank(score(CCvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth() + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of CC terms by significance')

Session info

I save the main variables to be able to load them in a new R session later.

save(allgenes2GO,
     GOdata.BP, BP.elim, BP.weight01, BP.lea, BP.pvalue.sigTerms,
     GOdata.MF, MF.elim, MF.weight01, MF.lea, MF.pvalue.sigTerms,
     GOdata.CC, CC.elim, CC.weight01, CC.lea, CC.pvalue.sigTerms,
     GOdataVar.BP, BPvar.elim, BPvar.weight01, BPvar.lea, BP.variance.sigTerms,
     GOdataVar.MF, MFvar.elim, MFvar.weight01, MFvar.lea, MF.variance.sigTerms,
     GOdataVar.CC, CCvar.elim, CCvar.weight01, CCvar.lea, CC.variance.sigTerms,
     file = paste('Enrichment', TAG, VAR, 'RData', sep='.'))
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=ca_ES.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=ca_ES.UTF-8    
##  [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=ca_ES.UTF-8   
##  [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
##  [1] grid      stats4    parallel  stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] Rgraphviz_2.30.0     ggplot2_3.2.1        limma_3.42.0        
##  [4] knitr_1.27           topGO_2.38.1         SparseM_1.78        
##  [7] GO.db_3.10.0         AnnotationDbi_1.48.0 IRanges_2.20.2      
## [10] S4Vectors_0.24.3     Biobase_2.46.0       graph_1.64.0        
## [13] BiocGenerics_0.32.0 
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.5   xfun_0.12          purrr_0.3.3        splines_3.6.2     
##  [5] lattice_0.20-38    colorspace_1.4-1   vctrs_0.2.1        htmltools_0.4.0   
##  [9] yaml_2.2.0         mgcv_1.8-31        blob_1.2.1         rlang_0.4.2       
## [13] pillar_1.4.3       glue_1.3.1         withr_2.1.2        DBI_1.1.0         
## [17] bit64_0.9-7        matrixStats_0.55.0 lifecycle_0.1.0    stringr_1.4.0     
## [21] munsell_0.5.0      gtable_0.3.0       memoise_1.1.0      evaluate_0.14     
## [25] labeling_0.3       highr_0.8          Rcpp_1.0.3         backports_1.1.5   
## [29] scales_1.1.0       farver_2.0.3       bit_1.1-15.1       digest_0.6.23     
## [33] stringi_1.4.5      dplyr_0.8.3        tools_3.6.2        magrittr_1.5      
## [37] lazyeval_0.2.2     tibble_2.1.3       RSQLite_2.2.0      crayon_1.3.4      
## [41] pkgconfig_2.0.3    zeallot_0.1.0      Matrix_1.2-18      assertthat_0.2.1  
## [45] rmarkdown_2.1      R6_2.4.1           nlme_3.1-143       compiler_3.6.2